With the pandemic in full swing, craft breweries across the nation are closing their doors. Social distancing and precautionary stay-at-home orders have changed the market for craft beer. This presents a unique opportunity for Anheuser-Busch InBev: in many ways, the market is wide open. Craft beers in the US, especially IPAs and ales, follow identifiable trends in bitterness and ABV. Adhering to these trends, for example when choosing which breweries to acquire, may help ensure that an investment is successful.
Using the craft beer data provided by Anheuser-Busch InBev, this report describes the apparent relationship between alcohol by volume (ABV) and International Bitterness Units (IBU) for beers across the United States. The report also provides summary statistics such as minimums, medians, and maximums for ABV and IBU, and takes a deeper look at the difference between IPAs and “Other Ales” (any other beer with Ale in its style name) with respect to ABV and IBU. Finally, our analysis examines how beer volumes in ounces relate to each US state, which could be useful to Anheuser-Busch InBev.
The beers and breweries datasets provided by Anheuser-Busch InBev contain information about 2410 US craft beers and 558 US breweries. The datasets are as follows:
Beers.csv:

* Name: Name of the beer.
* Beer_ID: Unique identifier of the beer.
* ABV: Alcohol by volume of the beer.
* IBU: International Bitterness Units of the beer.
* Brewery_ID: Brewery id associated with the beer.
* Style: Style of the beer.
* Ounces: Ounces of beer.

Breweries.csv:

* Brew_ID: Unique identifier of the brewery.
* Name: Name of the brewery.
* City: City where the brewery is located.
* State: U.S. State where the brewery is located.

This report is tasked with:

* analyzing the number of breweries in each state in the US
* correcting missing IBU data
* analyzing minimum, median, and maximum ABV and IBU for each state
* providing general summary statistics for ABV
* determining if a relationship between IBU and ABV exists
* analyzing the IBU and ABV relationship for IPAs vs other ales
* providing other meaningful insight
Note: we use the merged data in Question 1, and therefore we need to perform step 2 first.

# 2. Merge beer data first with the breweries data & print first 6 and last 6 observations in merged file.
| Brewery_id | Drink_name | Beer_ID | ABV | IBU | Style | Ounces | Brewery | City | State |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Get Together | 2692 | 0.045 | 50 | American IPA | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Maggie’s Leap | 2691 | 0.049 | 26 | Milk / Sweet Stout | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Wall’s End | 2690 | 0.048 | 19 | English Brown Ale | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Pumpion | 2689 | 0.060 | 38 | Pumpkin Ale | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Stronghold | 2688 | 0.060 | 25 | American Porter | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Parapet ESB | 2687 | 0.056 | 47 | Extra Special / Strong Bitter (ESB) | 16 | NorthGate Brewing | Minneapolis | MN |
|  | Brewery_id | Drink_name | Beer_ID | ABV | IBU | Style | Ounces | Brewery | City | State |
|---|---|---|---|---|---|---|---|---|---|---|
| 2405 | 556 | Pilsner Ukiah | 98 | 0.055 | NA | German Pilsener | 12 | Ukiah Brewing Company | Ukiah | CA |
| 2406 | 557 | Heinnieweisse Weissebier | 52 | 0.049 | NA | Hefeweizen | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2407 | 557 | Snapperhead IPA | 51 | 0.068 | NA | American IPA | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2408 | 557 | Moo Thunder Stout | 50 | 0.049 | NA | Milk / Sweet Stout | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2409 | 557 | Porkslap Pale Ale | 49 | 0.043 | NA | American Pale Ale (APA) | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2410 | 558 | Urban Wilderness Pale Ale | 30 | 0.049 | NA | English Pale Ale | 12 | Sleeping Lady Brewing Company | Anchorage | AK |
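The merge step above can be sketched in base R as follows. This is a minimal illustration on a three-row stand-in for the two files, not the report's actual code: the renaming of the two `Name` columns to `Drink_name` and `Brewery` is inferred from the column headers in the tables above, and the join key spelling `Brewery_id` follows the merged table rather than the data dictionary.

```r
# Tiny stand-ins for Beers.csv and Breweries.csv (values copied from the tables above).
beers <- data.frame(
  Name       = c("Get Together", "Maggie's Leap", "Porkslap Pale Ale"),
  Beer_ID    = c(2692, 2691, 49),
  ABV        = c(0.045, 0.049, 0.043),
  IBU        = c(50, 26, NA),
  Brewery_id = c(1, 1, 557),
  Style      = c("American IPA", "Milk / Sweet Stout", "American Pale Ale (APA)"),
  Ounces     = c(16, 16, 12)
)
breweries <- data.frame(
  Brew_ID = c(1, 557),
  Name    = c("NorthGate Brewing", "Butternuts Beer and Ale"),
  City    = c("Minneapolis", "Garrattsville"),
  State   = c("MN", "NY")
)

# Both files carry a Name column, so rename them first to avoid Name.x / Name.y.
names(beers)[names(beers) == "Name"] <- "Drink_name"
names(breweries)[names(breweries) == "Name"] <- "Brewery"

# Inner join: each beer is matched to its brewery by id.
merged <- merge(beers, breweries, by.x = "Brewery_id", by.y = "Brew_ID")

head(merged, 6)  # first (up to) 6 observations
tail(merged, 6)  # last (up to) 6 observations
```

With the full files, the two data frames would presumably come from `read.csv()` on Beers.csv and Breweries.csv instead of being typed in by hand.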
See Table:
## [1] "Total Unique Breweries: "
## [1] 558
##
## Variables sorted by number of missings:
## Variable Count
## IBU 0.417012448
## ABV 0.025726141
## Style 0.002074689
## Brewery_id 0.000000000
## Drink_name 0.000000000
## Beer_ID 0.000000000
## Ounces 0.000000000
## Brewery 0.000000000
## City 0.000000000
## State 0.000000000
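The "Count" column above is a proportion of missing values per variable (for example, 1005 missing IBU values out of 2410 rows gives 0.417). A minimal base R sketch of the same calculation, on a made-up four-row data frame named `bdat` (a hypothetical name), might look like this:

```r
# Toy data with some missing values; the real data has 2410 rows.
bdat <- data.frame(
  ABV   = c(0.045, NA, 0.060, 0.056),
  IBU   = c(50, NA, NA, 47),
  Style = c("American IPA", "Pumpkin Ale", NA, "American Porter"),
  stringsAsFactors = FALSE
)

# is.na() gives a logical matrix; column means are the share of missing values.
missing_share <- sort(colMeans(is.na(bdat)), decreasing = TRUE)
print(missing_share)
```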
## Brewery_id Drink_name Beer_ID ABV
## Min. : 1.0 Length:2410 Min. : 1.0 Min. :0.00100
## 1st Qu.: 94.0 Class :character 1st Qu.: 808.2 1st Qu.:0.05000
## Median :206.0 Mode :character Median :1453.5 Median :0.05600
## Mean :232.7 Mean :1431.1 Mean :0.05977
## 3rd Qu.:367.0 3rd Qu.:2075.8 3rd Qu.:0.06700
## Max. :558.0 Max. :2692.0 Max. :0.12800
## NA's :62
## IBU Style Ounces Brewery
## Min. : 4.00 Length:2410 Min. : 8.40 Length:2410
## 1st Qu.: 21.00 Class :character 1st Qu.:12.00 Class :character
## Median : 35.00 Mode :character Median :12.00 Mode :character
## Mean : 42.71 Mean :13.59
## 3rd Qu.: 64.00 3rd Qu.:16.00
## Max. :138.00 Max. :32.00
## NA's :1005
## City State
## Length:2410 CO : 265
## Class :character CA : 183
## Mode :character MI : 162
## IN : 139
## TX : 130
## OR : 125
## (Other):1406
- Add style data for 2527 and 1635 by looking it up by hand.
- Add IBU and ABV data for many missing rows by looking them up by hand (online via BeerAdvocate.com or Untappd.com).
| Brewery_id | ABV | IBU | Drink_name | Style | Ounces | Brewery | City | State | |
|---|---|---|---|---|---|---|---|---|---|
| Min. : 1.0 | Min. :0.00100 | Min. : 3.57 | Length:2400 | Length:2400 | Min. : 8.40 | Length:2400 | Length:2400 | CO : 261 | |
| 1st Qu.: 94.0 | 1st Qu.:0.05000 | 1st Qu.: 21.00 | Class :character | Class :character | 1st Qu.:12.00 | Class :character | Class :character | CA : 183 | |
| Median :206.5 | Median :0.05600 | Median : 35.00 | Mode :character | Mode :character | Median :12.00 | Mode :character | Mode :character | MI : 161 | |
| Mean :232.6 | Mean :0.05969 | Mean : 42.59 | NA | NA | Mean :13.58 | NA | NA | IN : 139 | |
| 3rd Qu.:367.0 | 3rd Qu.:0.06700 | 3rd Qu.: 64.00 | NA | NA | 3rd Qu.:16.00 | NA | NA | TX : 129 | |
| Max. :558.0 | Max. :0.12800 | Max. :138.00 | NA | NA | Max. :32.00 | NA | NA | OR : 125 | |
| NA | NA | NA’s :976 | NA | NA | NA | NA | NA | (Other):1402 |
## [1] Style Brewery_id ABV
## [4] Drink_name Ounces Brewery
## [7] City State median_IBU_by_style
## [10] IBU.clean
## <0 rows> (or 0-length row.names)
| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean | |
|---|---|---|---|---|---|---|---|---|---|---|
| Length:2348 | Min. : 1 | Min. :0.02700 | Length:2348 | Min. : 8.40 | Length:2348 | Length:2348 | CO : 258 | Min. : 8.00 | Min. : 3.57 | |
| Class :character | 1st Qu.: 92 | 1st Qu.:0.05000 | Class :character | 1st Qu.:12.00 | Class :character | Class :character | CA : 181 | 1st Qu.:21.00 | 1st Qu.: 21.00 | |
| Mode :character | Median :204 | Median :0.05600 | Mode :character | Median :12.00 | Mode :character | Mode :character | MI : 146 | Median :30.00 | Median : 32.00 | |
| NA | Mean :231 | Mean :0.05967 | NA | Mean :13.56 | NA | NA | IN : 139 | Mean :40.03 | Mean : 40.46 | |
| NA | 3rd Qu.:366 | 3rd Qu.:0.06700 | NA | 3rd Qu.:16.00 | NA | NA | TX : 129 | 3rd Qu.:69.00 | 3rd Qu.: 60.00 | |
| NA | Max. :558 | Max. :0.12800 | NA | Max. :32.00 | NA | NA | OR : 115 | Max. :96.00 | Max. :138.00 | |
| NA | NA | NA | NA | NA | NA | NA | (Other):1380 | NA | NA |
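The column names `median_IBU_by_style` and `IBU.clean` in the summary above suggest that the remaining missing IBU values were filled with the median IBU of each beer's style. The sketch below shows one way to do that in base R, under that assumption and on made-up rows; the report's own imputation code is not shown, so treat this as an illustration only.

```r
# Toy data: two styles, each with one missing IBU.
bdat <- data.frame(
  Style = c("American IPA", "American IPA", "American IPA", "Pumpkin Ale", "Pumpkin Ale"),
  IBU   = c(50, 70, NA, 38, NA)
)

# Median IBU within each style, ignoring missing values.
# ave() returns one value per row, aligned with the original data.
bdat$median_IBU_by_style <- ave(
  bdat$IBU, bdat$Style,
  FUN = function(x) median(x, na.rm = TRUE)
)

# Keep the observed IBU where available, otherwise fall back to the style median.
bdat$IBU.clean <- ifelse(is.na(bdat$IBU), bdat$median_IBU_by_style, bdat$IBU)
bdat
```

A style with no observed IBU at all would still end up missing under this approach, so such rows would need to be dropped or handled separately.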
##
## Variables sorted by number of missings:
## Variable Count
## Style 0
## Brewery_id 0
## ABV 0
## Drink_name 0
## Ounces 0
## Brewery 0
## City 0
## State 0
## median_IBU_by_style 0
## IBU.clean 0
## Style Brewery_id ABV Drink_name
## Length:2348 Min. : 1 Min. :0.02700 Length:2348
## Class :character 1st Qu.: 92 1st Qu.:0.05000 Class :character
## Mode :character Median :204 Median :0.05600 Mode :character
## Mean :231 Mean :0.05967
## 3rd Qu.:366 3rd Qu.:0.06700
## Max. :558 Max. :0.12800
##
## Ounces Brewery City State
## Min. : 8.40 Length:2348 Length:2348 CO : 258
## 1st Qu.:12.00 Class :character Class :character CA : 181
## Median :12.00 Mode :character Mode :character MI : 146
## Mean :13.56 IN : 139
## 3rd Qu.:16.00 TX : 129
## Max. :32.00 OR : 115
## (Other):1380
## median_IBU_by_style IBU.clean
## Min. : 8.00 Min. : 3.57
## 1st Qu.:21.00 1st Qu.: 21.00
## Median :30.00 Median : 32.00
## Mean :40.03 Mean : 40.46
## 3rd Qu.:69.00 3rd Qu.: 60.00
## Max. :96.00 Max. :138.00
##
## [1] "Total Unique Breweries: "
## [1] 558
| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean |
|---|---|---|---|---|---|---|---|---|---|
| Quadrupel (Quad) | 52 | 0.128 | Lee Hill Series Vol. 5 - Belgian Style Quadrupel Ale | 19.2 | Upslope Brewing Company | Boulder | CO | 24 | 24 |
| English Barleywine | 2 | 0.125 | London Balling | 16.0 | Against the Grain Brewery | Louisville | KY | 60 | 80 |
| Russian Imperial Stout | 18 | 0.120 | Csar | 16.0 | Tin Man Brewing Company | Evansville | IN | 94 | 90 |
| Rye Beer | 52 | 0.104 | Lee Hill Series Vol. 4 - Manhattan Style Rye Ale | 19.2 | Upslope Brewing Company | Boulder | CO | 57 | 57 |
| Baltic Porter | 47 | 0.100 | 4Beans | 12.0 | Sixpoint Craft Ales | Brooklyn | NY | 52 | 52 |
| American Barleywine | 310 | 0.099 | Old Devil’s Tooth | 12.0 | Sockeye Brewing Company | Boise | ID | 96 | 100 |
## [1] "highest ABV State"
## Style Brewery_id ABV
## 1 Quadrupel (Quad) 52 0.128
## Drink_name Ounces
## 1 Lee Hill Series Vol. 5 - Belgian Style Quadrupel Ale 19.2
## Brewery City State median_IBU_by_style IBU.clean
## 1 Upslope Brewing Company Boulder CO 24 24
| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean |
|---|---|---|---|---|---|---|---|---|---|
| American Double / Imperial IPA | 375 | 0.082 | Bitter Bitch Imperial IPA | 12 | Astoria Brewing Company | Astoria | OR | 90.5 | 138 |
| American IPA | 345 | 0.059 | Troopers Alley IPA | 12 | Wolf Hills Brewing Company | Abingdon | VA | 69.0 | 135 |
| American Double / Imperial IPA | 231 | 0.090 | Dead-Eye DIPA | 16 | Cape Ann Brewing Company | Gloucester | MA | 90.5 | 130 |
| American Double / Imperial IPA | 100 | 0.089 | Bay of Bengal Double IPA (2014) | 12 | Christian Moerlein Brewing Company | Cincinnati | OH | 90.5 | 126 |
| American Double / Imperial IPA | 273 | 0.080 | Heady Topper | 16 | The Alchemist | Waterbury | VT | 90.5 | 120 |
| American Double / Imperial IPA | 62 | 0.097 | Abrasive Ale | 16 | Surly Brewing Company | Brooklyn Center | MN | 90.5 | 120 |
## [1] "highest IBU State"
## Style Brewery_id ABV Drink_name
## 1 American Double / Imperial IPA 375 0.082 Bitter Bitch Imperial IPA
## Ounces Brewery City State median_IBU_by_style IBU.clean
## 1 12 Astoria Brewing Company Astoria OR 90.5 138
| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean | |
|---|---|---|---|---|---|---|---|---|---|---|
| Length:2348 | Min. : 1 | Min. :0.02700 | Length:2348 | Min. : 8.40 | Length:2348 | Length:2348 | CO : 258 | Min. : 8.00 | Min. : 3.57 | |
| Class :character | 1st Qu.: 92 | 1st Qu.:0.05000 | Class :character | 1st Qu.:12.00 | Class :character | Class :character | CA : 181 | 1st Qu.:21.00 | 1st Qu.: 21.00 | |
| Mode :character | Median :204 | Median :0.05600 | Mode :character | Median :12.00 | Mode :character | Mode :character | MI : 146 | Median :30.00 | Median : 32.00 | |
| NA | Mean :231 | Mean :0.05967 | NA | Mean :13.56 | NA | NA | IN : 139 | Mean :40.03 | Mean : 40.46 | |
| NA | 3rd Qu.:366 | 3rd Qu.:0.06700 | NA | 3rd Qu.:16.00 | NA | NA | TX : 129 | 3rd Qu.:69.00 | 3rd Qu.: 60.00 | |
| NA | Max. :558 | Max. :0.12800 | NA | Max. :32.00 | NA | NA | OR : 115 | Max. :96.00 | Max. :138.00 | |
| NA | NA | NA | NA | NA | NA | NA | (Other):1380 | NA | NA |
TODO: Speak to Assumptions for Linear Regression, and P-value and confidence interval for ABV estimate, scope of inference
##
## Call:
## lm(formula = IBU.clean ~ State + State * ABV + ABV, data = bdat.imputed.IBU.clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -79.146 -12.212 -1.991 12.028 87.085
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -129.29 37.59 -3.440 0.000593 ***
## StateAL 64.69 49.16 1.316 0.188368
## StateAR 160.47 72.28 2.220 0.026506 *
## StateAZ 120.87 40.10 3.014 0.002606 **
## StateCA 98.93 38.08 2.598 0.009428 **
## StateCO 114.12 37.95 3.007 0.002668 **
## StateCT 106.06 40.27 2.634 0.008500 **
## StateDC 61.86 49.86 1.241 0.214906
## StateDE 225.29 117.43 1.918 0.055178 .
## StateFL 70.90 40.51 1.750 0.080241 .
## StateGA 85.36 51.27 1.665 0.096033 .
## StateHI 94.06 43.05 2.185 0.028982 *
## StateIA 116.19 41.40 2.806 0.005056 **
## StateID 89.04 40.08 2.222 0.026413 *
## StateIL 86.22 38.63 2.232 0.025741 *
## StateIN 122.21 38.23 3.197 0.001410 **
## StateKS 96.74 41.84 2.312 0.020845 *
## StateKY 127.68 40.24 3.173 0.001529 **
## StateLA 86.83 42.71 2.033 0.042172 *
## StateMA 88.40 39.11 2.260 0.023891 *
## StateMD 116.52 43.51 2.678 0.007457 **
## StateME 95.31 40.22 2.370 0.017892 *
## StateMI 135.25 38.24 3.537 0.000413 ***
## StateMN 99.19 39.18 2.531 0.011425 *
## StateMO 75.51 41.31 1.828 0.067731 .
## StateMS 85.94 46.38 1.853 0.064002 .
## StateMT 85.41 43.54 1.962 0.049926 *
## StateNC 119.23 39.22 3.040 0.002394 **
## StateND 45.58 73.36 0.621 0.534452
## StateNE 106.82 41.52 2.573 0.010155 *
## StateNH 109.89 51.59 2.130 0.033284 *
## StateNJ 98.05 42.32 2.317 0.020596 *
## StateNM 31.91 50.27 0.635 0.525694
## StateNV 127.15 44.33 2.868 0.004167 **
## StateNY 100.31 38.69 2.592 0.009592 **
## StateOH 108.08 39.83 2.714 0.006706 **
## StateOK 95.50 42.82 2.230 0.025829 *
## StateOR 80.52 38.51 2.091 0.036651 *
## StatePA 137.92 38.60 3.573 0.000361 ***
## StateRI 117.62 41.30 2.848 0.004443 **
## StateSC 96.29 42.01 2.292 0.022006 *
## StateSD 140.02 66.71 2.099 0.035939 *
## StateTN 68.21 80.31 0.849 0.395799
## StateTX 84.98 38.36 2.215 0.026829 *
## StateUT 128.78 39.57 3.254 0.001153 **
## StateVA 81.82 41.03 1.994 0.046274 *
## StateVT 82.02 40.62 2.019 0.043611 *
## StateWA 121.64 39.69 3.065 0.002206 **
## StateWI 102.55 39.61 2.589 0.009677 **
## StateWV 19.39 169.13 0.115 0.908754
## StateWY 63.82 49.02 1.302 0.193128
## ABV 3001.54 672.17 4.465 8.38e-06 ***
## StateAL:ABV -1183.04 838.98 -1.410 0.158652
## StateAR:ABV -2985.79 1354.72 -2.204 0.027626 *
## StateAZ:ABV -2303.02 709.68 -3.245 0.001191 **
## StateCA:ABV -1785.93 679.02 -2.630 0.008593 **
## StateCO:ABV -2077.20 677.07 -3.068 0.002181 **
## StateCT:ABV -1951.42 710.11 -2.748 0.006043 **
## StateDC:ABV -1327.38 831.19 -1.597 0.110414
## StateDE:ABV -3801.54 1890.86 -2.010 0.044499 *
## StateFL:ABV -1337.25 716.63 -1.866 0.062166 .
## StateGA:ABV -1531.73 909.58 -1.684 0.092320 .
## StateHI:ABV -1792.26 765.74 -2.341 0.019342 *
## StateIA:ABV -2229.85 730.51 -3.052 0.002296 **
## StateID:ABV -1552.43 707.95 -2.193 0.028422 *
## StateIL:ABV -1599.72 686.70 -2.330 0.019917 *
## StateIN:ABV -2224.00 680.72 -3.267 0.001103 **
## StateKS:ABV -1790.34 744.49 -2.405 0.016262 *
## StateKY:ABV -2357.52 704.93 -3.344 0.000838 ***
## StateLA:ABV -1595.73 761.10 -2.097 0.036140 *
## StateMA:ABV -1609.45 698.31 -2.305 0.021271 *
## StateMD:ABV -2108.40 764.70 -2.757 0.005878 **
## StateME:ABV -1704.05 713.61 -2.388 0.017027 *
## StateMI:ABV -2527.87 681.02 -3.712 0.000211 ***
## StateMN:ABV -1666.19 695.38 -2.396 0.016653 *
## StateMO:ABV -1360.36 739.84 -1.839 0.066088 .
## StateMS:ABV -1477.17 809.50 -1.825 0.068165 .
## StateMT:ABV -1570.35 775.43 -2.025 0.042971 *
## StateNC:ABV -2191.86 697.28 -3.143 0.001691 **
## StateND:ABV -704.55 1331.48 -0.529 0.596756
## StateNE:ABV -1967.10 735.33 -2.675 0.007524 **
## StateNH:ABV -2023.81 943.99 -2.144 0.032149 *
## StateNJ:ABV -1648.93 743.89 -2.217 0.026748 *
## StateNM:ABV -545.71 862.92 -0.632 0.527193
## StateNV:ABV -2305.26 752.69 -3.063 0.002220 **
## StateNY:ABV -1778.79 689.84 -2.579 0.009985 **
## StateOH:ABV -1958.42 702.82 -2.787 0.005373 **
## StateOK:ABV -1852.86 752.00 -2.464 0.013817 *
## StateOR:ABV -1320.52 687.60 -1.920 0.054924 .
## StatePA:ABV -2483.76 687.23 -3.614 0.000308 ***
## StateRI:ABV -2246.67 733.43 -3.063 0.002215 **
## StateSC:ABV -1830.39 735.23 -2.490 0.012863 *
## StateSD:ABV -2675.35 1140.95 -2.345 0.019122 *
## StateTN:ABV -1175.32 1444.81 -0.813 0.416031
## StateTX:ABV -1598.07 683.80 -2.337 0.019525 *
## StateUT:ABV -2234.21 709.70 -3.148 0.001665 **
## StateVA:ABV -1384.88 729.61 -1.898 0.057811 .
## StateVT:ABV -1401.76 714.83 -1.961 0.050007 .
## StateWA:ABV -2113.67 706.96 -2.990 0.002822 **
## StateWI:ABV -2001.65 710.22 -2.818 0.004869 **
## StateWV:ABV -301.54 2734.91 -0.110 0.912216
## StateWY:ABV -1237.25 879.26 -1.407 0.159519
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.75 on 2246 degrees of freedom
## Multiple R-squared: 0.4258, Adjusted R-squared: 0.4
## F-statistic: 16.49 on 101 and 2246 DF, p-value: < 2.2e-16
To investigate how the relationship between IBU and ABV differs for IPAs vs OtherAles, we first perform some nominal data cleanup and visualize IBU vs ABV for IPAs and OtherAles. We then use KNN to classify style, either IPA or OtherAle, to highlight that the relationship between IBU and ABV differs significantly between IPAs and OtherAles.
First, keep only beers with IPA or Ale in their style. Then bucket anything with IPA or India Pale Ale as IPA, and all other beers with the word Ale in their style as OtherAle. Note: American Pale Ale is very similar to IPA, but we classify it as OtherAle.
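The bucketing rule can be sketched with `grepl()` as below. The exact patterns used in the report are not shown, so the two patterns here are assumptions that follow the rule as described; the style names are taken from the style counts in this section.

```r
styles <- c("American IPA", "English India Pale Ale (IPA)", "American Pale Ale (APA)",
            "Pumpkin Ale", "American India Pale Lager", "German Pilsener")

# IPA bucket: style mentions "IPA" or spells out "India Pale Ale".
is_ipa <- grepl("IPA|India Pale Ale", styles)

# Ale bucket: style contains the word "Ale". The match is case-sensitive,
# so the "ale" inside "Pale" does not count on its own.
is_ale <- grepl("Ale", styles, fixed = TRUE)

# IPA takes priority; other ales become OtherAle; everything else is dropped (NA).
bucket <- ifelse(is_ipa, "IPA", ifelse(is_ale, "OtherAle", NA))
data.frame(Style = styles, Bucket = bucket)
```

Under these patterns, American Pale Ale (APA) falls into OtherAle, which matches the note above, while lagers and pilseners fall outside both buckets.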
## Abbey Single Ale Altbier
## 2 13
## American Adjunct Lager American Amber / Red Ale
## 18 132
## American Amber / Red Lager American Barleywine
## 28 3
## American Black Ale American Blonde Ale
## 36 108
## American Brown Ale American Dark Wheat Ale
## 70 7
## American Double / Imperial IPA American Double / Imperial Pilsner
## 105 2
## American Double / Imperial Stout American India Pale Lager
## 9 3
## American IPA American Pale Ale (APA)
## 423 244
## American Pale Lager American Pale Wheat Ale
## 38 96
## American Pilsner American Porter
## 25 67
## American Stout American Strong Ale
## 39 14
## American White IPA American Wild Ale
## 11 6
## Baltic Porter Belgian Dark Ale
## 6 11
## Belgian IPA Belgian Pale Ale
## 18 24
## Belgian Strong Dark Ale Belgian Strong Pale Ale
## 6 7
## Berliner Weissbier Bière de Garde
## 11 7
## Bock California Common / Steam Beer
## 7 6
## Chile Beer Cream Ale
## 3 29
## Czech Pilsener Doppelbock
## 28 7
## Dortmunder / Export Lager Dubbel
## 6 5
## Dunkelweizen English Barleywine
## 4 3
## English Bitter English Brown Ale
## 3 18
## English Dark Mild Ale English India Pale Ale (IPA)
## 6 13
## English Pale Ale English Pale Mild Ale
## 12 3
## English Stout English Strong Ale
## 2 4
## Euro Dark Lager Euro Pale Lager
## 5 2
## Extra Special / Strong Bitter (ESB) Flanders Oud Bruin
## 20 1
## Foreign / Export Stout Fruit / Vegetable Beer
## 6 49
## German Pilsener Gose
## 36 10
## Grisette Hefeweizen
## 1 40
## Herbed / Spiced Beer Irish Dry Stout
## 9 5
## Irish Red Ale Keller Bier / Zwickel Bier
## 12 3
## Kölsch Lager
## 42 1
## Light Lager Maibock / Helles Bock
## 12 5
## Märzen / Oktoberfest Milk / Sweet Stout
## 30 10
## Munich Dunkel Lager Munich Helles Lager
## 4 20
## Oatmeal Stout Old Ale
## 18 2
## Other Pumpkin Ale
## 1 23
## Quadrupel (Quad) Radler
## 4 3
## Roggenbier Russian Imperial Stout
## 2 11
## Rye Beer Saison / Farmhouse Ale
## 18 52
## Schwarzbier Scotch Ale / Wee Heavy
## 9 15
## Scottish Ale Scottish-Style Amber Ale
## 19 1
## Smoked Beer Tripel
## 1 11
## Vienna Lager Wheat Ale
## 20 1
## Winter Warmer Witbier
## 15 51
## Style Brewery_id ABV Drink_name Ounces
## 1 OtherAle 58 0.049 Abbey's Single (2015- ) 12
## 2 OtherAle 58 0.049 Abbey's Single Ale (Current) 12
## 3 OtherAle 361 0.061 Hot Rod Red 12
## 4 OtherAle 553 0.056 Mickey Finn's Amber Ale 12
## 5 OtherAle 102 0.052 Hurricane Amber Ale 12
## 6 OtherAle 83 0.052 Fat Tire Amber Ale (2011) 12
## Brewery City State median_IBU_by_style
## 1 Destihl Brewery Bloomington IL 22
## 2 Destihl Brewery Bloomington IL 22
## 3 Aviator Brewing Company Fuquay-Varina NC 30
## 4 Mickey Finn's Brewery Libertyville IL 30
## 5 Coastal Extreme Brewing Company Newport RI 30
## 6 New Belgium Brewing Company Fort Collins CO 30
## IBU.clean
## 1 22
## 2 22
## 3 41
## 4 30
## 5 24
## 6 18
## Length Class Mode
## 2348 character character
Split the data into 85% train and 15% test. We train our algorithms only on the training data, using cross-validation, and use the test split only for calculating accuracy and prediction metrics.
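A minimal sketch of an 85/15 split in base R is shown below, on simulated rows. The seed, the data frame name `ales`, and the use of plain random sampling are assumptions; the report may have used a stratified helper instead, and the cross-validation step is not shown here.

```r
set.seed(42)  # arbitrary seed, only so the split is repeatable

# Simulated stand-in for the IPA / OtherAle data.
n <- 200
ales <- data.frame(
  ABV       = runif(n, 0.04, 0.10),
  IBU.clean = runif(n, 10, 100),
  Style     = sample(c("IPA", "OtherAle"), n, replace = TRUE)
)

# Draw 85% of the row indices for training; the remaining rows form the test set.
train_idx <- sample(seq_len(nrow(ales)), size = floor(0.85 * nrow(ales)))
train <- ales[train_idx, ]
test  <- ales[-train_idx, ]

c(train = nrow(train), test = nrow(test))
```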
IPA's vs Other Ales
## Confusion Matrix and Statistics
##
## classifications
## IPA OtherAle
## IPA 67 7
## OtherAle 16 140
##
## Accuracy : 0.9
## 95% CI : (0.8537, 0.9355)
## No Information Rate : 0.6391
## P-Value [Acc > NIR] : < 2e-16
##
## Kappa : 0.778
##
## Mcnemar's Test P-Value : 0.09529
##
## Sensitivity : 0.8072
## Specificity : 0.9524
## Pos Pred Value : 0.9054
## Neg Pred Value : 0.8974
## Prevalence : 0.3609
## Detection Rate : 0.2913
## Detection Prevalence : 0.3217
## Balanced Accuracy : 0.8798
##
## 'Positive' Class : IPA
##
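The headline statistics above follow directly from the four cell counts. The sketch below recomputes them by hand, assuming that rows are predictions and columns are the true classes; that orientation is an assumption, but it is the one that reproduces the reported sensitivity and specificity.

```r
# Cell counts from the confusion matrix above (filled column by column).
cm <- matrix(c(67, 16, 7, 140), nrow = 2,
             dimnames = list(predicted = c("IPA", "OtherAle"),
                             actual    = c("IPA", "OtherAle")))

accuracy    <- sum(diag(cm)) / sum(cm)                     # correct / total
sensitivity <- cm["IPA", "IPA"] / sum(cm[, "IPA"])         # true IPAs found
specificity <- cm["OtherAle", "OtherAle"] / sum(cm[, "OtherAle"])
ppv         <- cm["IPA", "IPA"] / sum(cm["IPA", ])         # predicted IPAs that are IPAs
npv         <- cm["OtherAle", "OtherAle"] / sum(cm["OtherAle", ])

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, ppv = ppv, npv = npv), 4)
```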
IPA's vs Other Ales

Use least-squares multiple linear regression to highlight specific relationships between IBU and ABV for style IPA and style OtherAle. Note: interpret the slope in terms of a 0.01-unit increase in ABV.
## 2.5 % 97.5 %
## (Intercept) 8.439664 22.633685
## StyleOtherAle -23.536657 -5.964116
## ABV 707.658144 912.217775
## StyleOtherAle:ABV -373.372852 -101.203617
##
## Call:
## lm(formula = IBU.clean ~ Style + Style * ABV + ABV, data = bdat.IPA.Vs.Ales.train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -47.902 -9.137 -2.282 8.286 76.991
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.537 3.618 4.295 1.88e-05 ***
## StyleOtherAle -14.750 4.479 -3.293 0.001016 **
## ABV 809.938 52.136 15.535 < 2e-16 ***
## StyleOtherAle:ABV -237.288 69.367 -3.421 0.000644 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.42 on 1296 degrees of freedom
## Multiple R-squared: 0.6573, Adjusted R-squared: 0.6566
## F-statistic: 828.8 on 3 and 1296 DF, p-value: < 2.2e-16
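The model above is a regression with a style-by-ABV interaction. The sketch below fits the same kind of model to simulated data, so the coefficients it produces are not the report's; the data-generating slopes and the data frame name `sim` are made up for illustration. Note that `Style * ABV` already expands to `Style + ABV + Style:ABV`, so it is equivalent to the longer formula shown in the call above.

```r
set.seed(1)  # arbitrary seed for a repeatable simulation

n     <- 300
Style <- factor(sample(c("IPA", "OtherAle"), n, replace = TRUE))
ABV   <- runif(n, 0.04, 0.10)

# Simulated IBU: IPAs get a steeper ABV slope (made-up values).
slope     <- ifelse(Style == "IPA", 800, 550)
IBU.clean <- 15 + slope * ABV + rnorm(n, sd = 5)
sim       <- data.frame(Style, ABV, IBU.clean)

# IPA is the reference level, so ABV is the IPA slope and
# StyleOtherAle:ABV is the change in slope for other ales.
fit <- lm(IBU.clean ~ Style * ABV, data = sim)

coef(fit)     # point estimates
confint(fit)  # 95% confidence intervals
```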
Here we use LDA and KNN to assess the relationship between IBU and Style as well as ABV and style.
It is clear that there is a significant relationship between IBU and ABV, and that the relationship differs between IPAs and other ales. We were able to predict the style of ale (either IPA or Other Ale) with an accuracy of 90% on the test set (P-value < 2e-16), and we are 95% confident that the true accuracy of our model is between 0.8537 and 0.9355. On average, holding all other variables constant, the IBU of an IPA increases by an estimated 2.373 units more per 0.01 increase in ABV than the IBU of other ales. That is to say, IPAs generally have a higher bitterness for a given ABV than other ales, and the ratio of IBU to ABV is generally higher for IPAs. We are 95% confident that this ‘IPA’ effect is between 1.012 and 3.734 IBU per 0.01 increase in ABV. This ‘IPA’ effect applies to all craft beers sampled in the study, as well as to all craft beers in the USA for which the beers sampled in the study are a good representation. One simple explanation for this ‘IPA’ effect is that IPAs, on average, use a higher ratio of hops in the brewing process and will therefore generally have higher bitterness for the same ABV when compared to other ales.
That said, IPAs also generally have a higher ABV (when IBU is not held fixed). The skew of IPAs toward higher ABV could be because beer drinkers generally drink fewer IPAs and are willing to spend more money on them. As such, to achieve the same ‘buzz’, the discerning IPA drinker will gravitate towards higher IBU and higher ABV.
To show whether there is a relationship between State and Ounces for all ales, specifically 12 vs 16 ounces, we perform a brief analysis of Ounces vs. State. What we are interested in here is whether states such as California and Michigan differ in their preference for 12 vs 16 ounce beers.
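The per-0.01 figures quoted in this interpretation come from rescaling the regression output, which is reported per one full unit of ABV. The arithmetic, using only the coefficient and confidence-interval values printed in the regression summary, works out as follows (variable names are illustrative):

```r
# Values copied from the regression output (per 1.0 unit of ABV).
abv_slope_ipa   <- 809.938                      # ABV slope for IPAs (reference style)
interaction_adj <- -237.288                     # slope change for OtherAle
ci_interaction  <- c(-373.372852, -101.203617)  # 95% CI for the slope change

step <- 0.01  # interpret per 0.01 increase in ABV

ipa_per_step   <- abv_slope_ipa * step                      # IBU gain for IPAs
other_per_step <- (abv_slope_ipa + interaction_adj) * step  # IBU gain for other ales
ipa_effect     <- ipa_per_step - other_per_step             # extra gain for IPAs

# Flip the sign so the interval describes IPA relative to OtherAle.
ipa_effect_ci <- sort(-ci_interaction * step)

round(c(ipa = ipa_per_step, other = other_per_step, effect = ipa_effect), 3)
round(ipa_effect_ci, 3)
```

So an IPA gains roughly 8.1 IBU per 0.01 ABV against roughly 5.7 for other ales, a difference of about 2.37.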
We restrict the data to ales that are 12 or 16 ounces in volume and compare 12 vs 16 ounce ales for each state.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: StateWV
## Confusion Matrix and Statistics
##
## classifications
## 12 16
## 12 127 14
## 16 51 35
##
## Accuracy : 0.7137
## 95% CI : (0.6501, 0.7715)
## No Information Rate : 0.7841
## P-Value [Acc > NIR] : 0.9951
##
## Kappa : 0.3359
##
## Mcnemar's Test P-Value : 7.998e-06
##
## Sensitivity : 0.7135
## Specificity : 0.7143
## Pos Pred Value : 0.9007
## Neg Pred Value : 0.4070
## Prevalence : 0.7841
## Detection Rate : 0.5595
## Detection Prevalence : 0.6211
## Balanced Accuracy : 0.7139
##
## 'Positive' Class : 12
##
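To put the accuracy above in context, it helps to compare it with the no-information rate, which is the accuracy obtained by always predicting the more common size. The sketch below recomputes both from the cell counts and runs a one-sided binomial test of accuracy against that rate; treating this test as the source of the reported `P-Value [Acc > NIR]` is an assumption, as is the row and column orientation.

```r
# Cell counts from the confusion matrix above (filled column by column).
cm <- matrix(c(127, 51, 14, 35), nrow = 2,
             dimnames = list(predicted = c("12", "16"),
                             actual    = c("12", "16")))

correct  <- sum(diag(cm))
total    <- sum(cm)
accuracy <- correct / total

# No-information rate: share of the most common true class.
nir <- max(colSums(cm)) / total

# One-sided test: is accuracy greater than the no-information rate?
p_value <- binom.test(correct, total, p = nir, alternative = "greater")$p.value

round(c(accuracy = accuracy, nir = nir, p_value = p_value), 4)
```

A large p-value here means the classifier does not outperform always guessing 12 ounces, even though it does better than a coin flip.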
There appears to be a relationship between Ounces and State for ales that are either 12 or 16 ounces. We were able to predict ounces by state (for ales) with an accuracy of 71.4%. This is better than a 50% coin flip, but it is below the no-information rate of 78.4% reported in the confusion matrix (P-Value [Acc > NIR] = 0.9951), so the evidence should be read as suggestive rather than conclusive. Even so, when considering what size a beer should be sold in, it could be important to consider the state in which that beer is going to be brewed, which helps align sales with the laws and preferences of that state. For example, Indiana has a high prevalence of 16 ounce ales, whereas Colorado and Texas both lean towards 12 ounce ales. It would be preferable to brew and sell 16 ounce beers in states like Indiana (or other states that have more 16 ounce ales) and 12 ounce beers in Colorado (or other states that have more 12 ounce ales). This association may have a variety of causes, including income, weather, local diet, and social norms for beer drinkers.
There are several noteworthy relationships in our craft beer dataset. For example, median IBU and median ABV for craft beers vary by state. When targeting sales of a particular state’s craft beer, it would be wise to consider the median IBU and median ABV of the current craft beers in that state.
Further, it is clear that there is a significant relationship between IBU and ABV, and that the relationship differs between IPAs and other ales. We were able to predict the style of ale (either IPA or Other Ale) with an accuracy of 90% on the test set (P-value < 2e-16), and we are 95% confident that the true accuracy of our model is between 0.8537 and 0.9355. On average, holding all other variables constant, the IBU of an IPA increases by an estimated 2.373 units more per 0.01 increase in ABV than the IBU of other ales. That is to say, IPAs generally have a higher bitterness for a given ABV than other ales, and the ratio of IBU to ABV is generally higher for IPAs. We are 95% confident that this ‘IPA’ effect is between 1.012 and 3.734 IBU per 0.01 increase in ABV. This ‘IPA’ effect applies to all craft beers sampled in the study, as well as to all craft beers in the USA for which the beers sampled in the study are a good representation. One simple explanation for this ‘IPA’ effect is that IPAs, on average, use a higher ratio of hops in the brewing process and will therefore generally have higher bitterness for the same ABV when compared to other ales.
That said, IPAs also generally have a higher ABV (when IBU is not held fixed). The skew of IPAs toward higher ABV could be because beer drinkers generally drink fewer IPAs and are willing to spend more money on them. As such, to achieve the same ‘buzz’, the discerning IPA drinker will gravitate towards higher IBU and higher ABV.
There also appears to be a relationship between Ounces and State for ales that are either 12 or 16 ounces. We were able to predict ounces by state (for ales) with an accuracy of 71.4%. This is better than a 50% coin flip, but it is below the no-information rate of 78.4% reported in the confusion matrix (P-Value [Acc > NIR] = 0.9951), so the evidence should be read as suggestive rather than conclusive. Even so, when considering what size a beer should be sold in, it could be important to consider the state in which that beer is going to be brewed, which helps align sales with the laws and preferences of that state. For example, Indiana has a high prevalence of 16 ounce ales, whereas Colorado and Texas both lean towards 12 ounce ales. It would be preferable to brew and sell 16 ounce beers in states like Indiana (or other states that have more 16 ounce ales) and 12 ounce beers in Colorado (or other states that have more 12 ounce ales). This association may have a variety of causes, including income, weather, local diet, and social norms for beer drinkers.
We recommend that Anheuser-Busch InBev stay consistent with established patterns for given states: tend towards breweries that produce IPAs and ales that can be accurately predicted with our models above, and favor beers and breweries whose average ABV, IBU, and volume are not dissimilar from established norms for that state. Given how strapped the market for craft beer and craft breweries is, the opportunity is ripe.